2,267 research outputs found
Visually Mining Interesting Patterns in Multivariate Datasets
Data mining for patterns and knowledge discovery in multivariate datasets are very important processes and tasks to help analysts understand the dataset, describe the dataset, and predict unknown data values. However, conventional computer-supported data mining approaches often limit the user from getting involved in the mining process and performing interactions during the pattern discovery. Besides, without the visual representation of the extracted knowledge, the analysts can have difficulty explaining and understanding the patterns. Therefore, instead of directly applying automatic data mining techniques, it is necessary to develop appropriate techniques and visualization systems that allow users to interactively perform knowledge discovery, visually examine the patterns, adjust the parameters, and discover more interesting patterns based on their requirements. In the dissertation, I will discuss different proposed visualization systems to assist analysts in mining patterns and discovering knowledge in multivariate datasets, including the design, implementation, and the evaluation. Three types of different patterns are proposed and discussed, including trends, clusters of subgroups, and local patterns. For trend discovery, the parameter space is visualized to allow the user to visually examine the space and find where good linear patterns exist. For cluster discovery, the user is able to interactively set the query range on a target attribute, and retrieve all the sub-regions that satisfy the user's requirements. The sub-regions that satisfy the same query and are near each other are grouped and aggregated to form clusters. For local pattern discovery, the patterns for the local sub-region with a focal point and its neighbors are computationally extracted and visually represented. To discover interesting local neighbors, the extracted local patterns are integrated and visually shown to the analysts. Evaluations of the three visualization systems using formal user studies are also performed and discussed.
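The cluster discovery step described in the abstract above (query a range on a target attribute, keep the sub-regions that satisfy it, group the ones that are near each other) might be sketched roughly as follows. This is only an illustration under assumptions that the abstract does not state: sub-regions are taken to be cells of a 2-D grid, each cell is summarized by a mean target value, and "near" is taken to mean sharing an edge. The function name `query_clusters` and its parameters are hypothetical.

```python
from collections import deque


def query_clusters(cell_means, lo, hi):
    """Group grid cells whose mean target value lies in [lo, hi].

    cell_means: dict mapping (row, col) -> mean target value of that cell
                (a hypothetical summary of each sub-region).
    Returns a list of clusters, each a sorted list of edge-adjacent cells.
    """
    # Step 1: keep only the sub-regions that satisfy the query range.
    hits = {cell for cell, value in cell_means.items() if lo <= value <= hi}

    # Step 2: group matching cells that touch each other (breadth-first search).
    clusters = []
    seen = set()
    for start in sorted(hits):
        if start in seen:
            continue
        seen.add(start)
        queue = deque([start])
        group = []
        while queue:
            row, col = queue.popleft()
            group.append((row, col))
            neighbors = ((row + 1, col), (row - 1, col),
                         (row, col + 1), (row, col - 1))
            for nb in neighbors:
                if nb in hits and nb not in seen:
                    seen.add(nb)
                    queue.append(nb)
        clusters.append(sorted(group))
    return clusters


# Illustrative data only: five cells with made-up mean values.
example = {(0, 0): 5.0, (0, 1): 6.0, (1, 1): 5.5, (3, 3): 5.2, (2, 0): 9.0}
clusters = query_clusters(example, lo=5.0, hi=6.0)
```

Changing `lo` and `hi` and re-running the function mimics the interactive adjustment of the query range; the dissertation's actual system may define sub-regions and proximity differently.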
Model and Integrate Medical Resource Available Times and Relationships in Verifiably Correct Executable Medical Best Practice Guideline Models (Extended Version)
Improving patient care safety is an ultimate objective for medical
cyber-physical systems. A recent study shows that the patients' death rate is
significantly reduced by computerizing medical best practice guidelines. Recent
data also show that some morbidity and mortality in emergency care are directly
caused by delayed or interrupted treatment due to lack of medical resources.
However, medical guidelines usually do not provide guidance on medical resource
demands and how to manage potential unexpected delays in resource availability.
If medical resources are temporarily unavailable, safety properties in existing
executable medical guideline models may fail which may cause increased risk to
patients under care. The paper presents a separately-model-and-jointly-verify
(SMJV) architecture that models medical resource available times and
relationships separately and then verifies the safety properties of existing
medical best practice guideline models jointly with the integrated resource
models. The SMJV
architecture allows medical staff to effectively manage medical resource
demands and unexpected resource availability delays during emergency care. The
separated modeling approach also allows different domain professionals to make
independent model modifications, facilitates the management of frequent
resource availability changes, and enables resource statechart reuse in
multiple medical guideline models. A simplified stroke scenario is used as a
case study to investigate the effectiveness and validity of the SMJV
architecture. The case study indicates that the SMJV architecture is able to
identify unsafe properties caused by unexpected resource delays. Comment: full version, 12 pages
Generalized Hyper-cylinders: a Mechanism for Modeling and Visualizing N-D Objects
The display of surfaces and solids has usually been restricted to the domain of scientific visualization; however, little work has been done on the visualization of surfaces and solids of dimensionality higher than three or four. Indeed, most high-dimensional visualization focuses on the display of data points. However, the ability to effectively model and visualize higher dimensional objects such as clusters and patterns would be quite useful in studying their shapes, relationships, and changes over time.
In this paper we describe a method for the description, extraction, and visualization of N-dimensional surfaces and solids. The approach is to extend generalized cylinders, an object representation used in geometric modeling and computer vision, to arbitrary dimensionality, resulting in what we term Generalized Hyper-cylinders (GHCs). A basic GHC consists of two N-dimensional hyper-spheres connected by a hyper-cylinder whose shape at any point along the cylinder is determined by interpolating between the endpoint shapes. More complex GHCs involve alternate cross-section shapes and curved spines connecting the ends. Several algorithms for constructing or extracting GHCs from multivariate data sets are proposed. Once extracted, the GHCs can be visualized using a variety of projection techniques and methods to convey cross-section shapes.
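To make the basic GHC described above more concrete, the following is a minimal sketch of a membership test for a point in N dimensions. It assumes a straight spine between the two hyper-sphere centers, spherical cross-sections, and linear interpolation of the radius along the spine; the abstract says only that the shape is interpolated between the endpoint shapes, so these choices, along with the function name `ghc_contains`, are assumptions made for illustration.

```python
def ghc_contains(point, c0, r0, c1, r1):
    """Return True if `point` lies inside a basic generalized hyper-cylinder.

    c0, c1: centers of the two end hyper-spheres (sequences of equal length N).
    r0, r1: radii of the two end hyper-spheres.
    Assumes a straight spine and a linearly interpolated radius.
    """
    axis = [b - a for a, b in zip(c0, c1)]      # spine direction c1 - c0
    rel = [p - a for p, a in zip(point, c0)]    # point relative to c0
    axis_len_sq = sum(a * a for a in axis)

    if axis_len_sq == 0.0:
        t = 0.0                                 # degenerate spine: use c0 and r0
    else:
        # Position of the projection along the spine, clamped to the segment.
        t = sum(r * a for r, a in zip(rel, axis)) / axis_len_sq
        t = min(1.0, max(0.0, t))

    radius = (1.0 - t) * r0 + t * r1            # interpolated cross-section radius
    spine_pt = [a + t * d for a, d in zip(c0, axis)]
    dist_sq = sum((p - s) ** 2 for p, s in zip(point, spine_pt))
    return dist_sq <= radius * radius


# Illustrative 4-D example: radius grows from 1 to 3 along the first axis.
start, end = (0.0, 0.0, 0.0, 0.0), (4.0, 0.0, 0.0, 0.0)
inside_mid = ghc_contains((2.0, 1.5, 0.0, 0.0), start, 1.0, end, 3.0)
outside_mid = ghc_contains((2.0, 2.5, 0.0, 0.0), start, 1.0, end, 3.0)
```

Clamping the spine parameter to the segment means points beyond either end are tested against the corresponding end hyper-sphere. Curved spines and alternate cross-section shapes, which the abstract mentions for more complex GHCs, are not covered by this sketch.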
Schrödinger equations with magnetic fields and Hardy-Sobolev critical exponents
This article is motivated by problems in astrophysics. We consider nonlinear Schrödinger equations and related systems with magnetic fields and Hardy-Sobolev critical exponents. Under proper conditions, existence of ground state solutions to these equations and systems is established.
Distributionally Robust Machine Learning with Multi-source Data
Classical machine learning methods may lead to poor prediction performance
when the target distribution differs from the source populations. This paper
utilizes data from multiple sources and introduces a group distributionally
robust prediction model defined to optimize an adversarial reward about
explained variance with respect to a class of target distributions. Compared to
classical empirical risk minimization, the proposed robust prediction model
improves the prediction accuracy for target populations with distribution
shifts. We show that our group distributionally robust prediction model is a
weighted average of the source populations' conditional outcome models. We
leverage this key identification result to robustify arbitrary machine learning
algorithms, including, for example, random forests and neural networks. We
devise a novel bias-corrected estimator to estimate the optimal aggregation
weight for general machine-learning algorithms and demonstrate its improvement
in the convergence rate. Our proposal can be seen as a distributionally robust
federated learning approach that is computationally efficient and easy to
implement using arbitrary machine learning base algorithms, satisfies some
privacy constraints, and has a nice interpretation of different sources'
importance for predicting a given target covariate distribution. We demonstrate
the performance of our proposed group distributionally robust method on
simulated and real data with random forests and neural networks as
base-learning algorithms.
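The identification result stated in the abstract above, that the robust model is a weighted average of the source populations' conditional outcome models, can be illustrated with a small sketch. It covers only the aggregation step: the weights are placeholder inputs, assumed here to be non-negative and to sum to one, and the abstract's bias-corrected estimator of the optimal weights is not reproduced. The function name `aggregate_predictions` is hypothetical.

```python
def aggregate_predictions(source_predictions, weights):
    """Combine per-source predictions into one weighted-average prediction.

    source_predictions: list of lists; source_predictions[l][i] is the
        prediction of source model l (e.g. a fitted random forest or neural
        network) at target point i.
    weights: one aggregation weight per source (placeholder values; assumed
        non-negative and summing to one).
    """
    if len(source_predictions) != len(weights):
        raise ValueError("need exactly one weight per source model")
    if any(w < 0 for w in weights) or abs(sum(weights) - 1.0) > 1e-9:
        raise ValueError("weights must be non-negative and sum to one")

    n_points = len(source_predictions[0])
    if any(len(preds) != n_points for preds in source_predictions):
        raise ValueError("all sources must predict the same target points")

    return [
        sum(w * preds[i] for w, preds in zip(weights, source_predictions))
        for i in range(n_points)
    ]


# Illustrative numbers only: two source models, two target points.
robust_prediction = aggregate_predictions(
    [[1.0, 2.0], [3.0, 6.0]],
    [0.25, 0.75],
)
```

Because only predictions and weights are combined, each source can fit its own base learner locally, which is in line with the federated-learning reading given in the abstract.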